23 research outputs found

    Dialogue Act Modeling for Automatic Tagging and Recognition of Conversational Speech

    We describe a statistical approach for modeling dialogue acts in conversational speech, i.e., speech-act-like units such as Statement, Question, Backchannel, Agreement, Disagreement, and Apology. Our model detects and predicts dialogue acts based on lexical, collocational, and prosodic cues, as well as on the discourse coherence of the dialogue act sequence. The dialogue model is based on treating the discourse structure of a conversation as a hidden Markov model and the individual dialogue acts as observations emanating from the model states. Constraints on the likely sequence of dialogue acts are modeled via a dialogue act n-gram. The statistical dialogue grammar is combined with word n-grams, decision trees, and neural networks modeling the idiosyncratic lexical and prosodic manifestations of each dialogue act. We develop a probabilistic integration of speech recognition with dialogue modeling, to improve both speech recognition and dialogue act classification accuracy. Models are trained and evaluated using a large hand-labeled database of 1,155 conversations from the Switchboard corpus of spontaneous human-to-human telephone speech. We achieved good dialogue act labeling accuracy (65% based on errorful, automatically recognized words and prosody, and 71% based on word transcripts, compared to a chance baseline accuracy of 35% and human accuracy of 84%) and a small reduction in word recognition error.
    Comment: 35 pages, 5 figures. Changes in copy editing (note title spelling changed
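
    The combination of a dialogue act n-gram with per-utterance evidence can be illustrated with Viterbi decoding. The following is a minimal sketch, not the authors' implementation: it assumes a bigram over dialogue acts, assumes the acts are the labels to be decoded, and assumes the per-utterance likelihoods have already been produced by some lexical or prosodic model. The function name `viterbi`, the helper `logs`, and all probabilities are invented for illustration.

```python
import math

def viterbi(obs_loglik, log_trans, log_init):
    """Return the most likely dialogue-act sequence.

    obs_loglik: one dict per utterance, act -> log P(evidence | act)
    log_trans:  dict (prev_act, act) -> log P(act | prev_act), i.e. a bigram
    log_init:   dict act -> log P(act) for the first utterance
    """
    acts = list(log_init)
    best = {a: log_init[a] + obs_loglik[0][a] for a in acts}
    back = []
    for lik in obs_loglik[1:]:
        prev_best, best, pointers = best, {}, {}
        for a in acts:
            # Best predecessor of act `a` under the bigram model.
            p = max(acts, key=lambda q: prev_best[q] + log_trans[(q, a)])
            best[a] = prev_best[p] + log_trans[(p, a)] + lik[a]
            pointers[a] = p
        back.append(pointers)
    # Trace the best path backwards from the best final act.
    path = [max(acts, key=lambda a: best[a])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

def logs(table):
    """Convert a table of probabilities to log-probabilities."""
    return {key: math.log(p) for key, p in table.items()}

# Toy numbers, invented for illustration only (not taken from the paper).
init = logs({"Statement": 0.6, "Question": 0.4})
trans = logs({
    ("Statement", "Statement"): 0.7, ("Statement", "Question"): 0.3,
    ("Question", "Statement"): 0.8, ("Question", "Question"): 0.2,
})
evidence = [
    logs({"Statement": 0.2, "Question": 0.8}),
    logs({"Statement": 0.9, "Question": 0.1}),
    logs({"Statement": 0.5, "Question": 0.5}),
]
print(viterbi(evidence, trans, init))
# → ['Question', 'Statement', 'Statement']
```

    In this toy run the third utterance has uninformative evidence, so the bigram alone decides its label.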

    Building Greibach Normal Form Grammars Using Genetic Algorithms

    Grammatical inference of context-free grammars using positive and negative language examples is among the most challenging tasks in modern artificial and natural language technology. Recently, several implementations combining various techniques, usually including the Backus–Naur form, have been proposed. In this paper, we explore a new implementation of grammatical inference using evolutionary methods focused on the Greibach normal form and exploiting its properties, and also propose new solutions both in the evolutionary processes and in the corresponding fitness estimation.
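
    One property of the Greibach normal form that makes it convenient here is that every production starts with a terminal, so each derivation step consumes exactly one input symbol. The sketch below checks that form, recognizes words by backtracking, and scores a candidate grammar on positive and negative examples. It is an assumption-laden illustration: the abstract does not specify the paper's fitness function, and the names `is_gnf`, `accepts`, and `fitness`, as well as the example grammars, are hypothetical.

```python
def is_gnf(grammar, terminals):
    """True if every rule is A -> a B1 ... Bk (a terminal, then nonterminals)."""
    return all(
        len(rhs) >= 1 and rhs[0] in terminals and all(s in grammar for s in rhs[1:])
        for rules in grammar.values()
        for rhs in rules
    )

def accepts(grammar, start, word):
    """Backtracking recognizer for a grammar in Greibach normal form."""
    def derive(stack, i):
        if not stack:
            return i == len(word)
        # Every pending nonterminal must still produce at least one terminal.
        if len(stack) > len(word) - i:
            return False
        head, rest = stack[0], stack[1:]
        return any(
            rhs[0] == word[i] and derive(tuple(rhs[1:]) + rest, i + 1)
            for rhs in grammar[head]
        )
    return derive((start,), 0)

def fitness(grammar, start, positives, negatives):
    """Assumed fitness: fraction of examples classified correctly."""
    correct = sum(accepts(grammar, start, w) for w in positives)
    correct += sum(not accepts(grammar, start, w) for w in negatives)
    return correct / (len(positives) + len(negatives))

TERMINALS = {"a", "b"}
# Candidate for the language a^n b^n (n >= 1): S -> a B | a S B, B -> b.
target = {"S": [("a", "B"), ("a", "S", "B")], "B": [("b",)]}
# A weaker candidate that only derives "ab".
weak = {"S": [("a", "B")], "B": [("b",)]}

positives = ["ab", "aabb", "aaabbb"]
negatives = ["", "a", "ba", "aab", "abb"]
print(is_gnf(target, TERMINALS))                    # → True
print(fitness(target, "S", positives, negatives))   # → 1.0
print(fitness(weak, "S", positives, negatives))     # → 0.75
```

    An evolutionary search would mutate and recombine candidate rule sets and keep those with higher fitness; that loop is omitted here.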

    Design of Automatic Target Recognition system based on multistatic passive RADAR

    Nowadays, a great number of researchers are concerned with aerial targets, since they are involved in many aspects of everyday life, such as military issues, including aircraft and missiles, or Unmanned Aerial Vehicles (UAVs) flying over city centers or airport areas. The design of an efficient Automatic Target Recognition (ATR) system has been an attractive problem, and for this reason many researchers rely on experiments to extract the necessary data for the ATR system. In this paper, a new method for extracting the radar cross-section (RCS) data is proposed, which uses the Boundary Element Method (BEM) to efficiently compute the RCS values of different objects at any point of the coordinate system in a short period of time, without the need for any experiment. Multiple RCS values in the presence of noise are used for the training and the evaluation of two ATR systems, based on the nearest neighbor classification rule or a multilayer neural network.
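
    The nearest neighbor rule mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: the RCS templates are made-up numbers in arbitrary units (not BEM output), the Gaussian noise model and its `sigma` are assumed, and the names `nearest_neighbor` and `noisy` are hypothetical.

```python
import math
import random

def nearest_neighbor(train, query):
    """train: list of (feature_vector, label); return the closest vector's label."""
    return min(train, key=lambda item: math.dist(item[0], query))[1]

def noisy(values, rng, sigma=0.5):
    """Add zero-mean Gaussian noise to each value (assumed noise model)."""
    return [v + rng.gauss(0.0, sigma) for v in values]

rng = random.Random(0)  # fixed seed so the example is repeatable

# Hypothetical RCS values seen by three receivers, one template per class.
templates = {"aircraft": [10.0, 12.0, 9.0], "uav": [0.5, 0.8, 0.3]}

# Training set: five noisy copies of each template.
train = [(noisy(rcs, rng), label)
         for label, rcs in templates.items()
         for _ in range(5)]

query = noisy(templates["uav"], rng)
print(nearest_neighbor(train, query))
```

    Because the two templates are far apart relative to the noise level, the query is assigned to the class it was generated from; with closer classes or stronger noise the rule would start to make errors.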

    Finger Vein Segmentation from Infrared Images Based on a Modified Separable Mumford Shah Model and Local Entropy Thresholding

    A novel method for finger vein pattern extraction from infrared images is presented. This method involves four steps: preprocessing, which performs local normalization of the image intensity; image enhancement; image segmentation; and finally postprocessing for image cleaning. In the image enhancement step, an image that is both smooth and similar to the original is sought. The enhanced image is obtained by minimizing the objective function of a modified separable Mumford Shah Model. Since this minimization procedure is computationally intensive for large images, a local application of the Mumford Shah Model in small window neighborhoods is proposed. The finger veins are located in concave nonsmooth regions, so, in order to distinguish them from the other tissue parts, all the differences between the smooth neighborhoods, obtained by the local application of the model, and the corresponding windows of the original image are added. After that, the veins in the enhanced image have been sufficiently emphasized. Thus, after image enhancement, an accurate segmentation can be obtained readily by a local entropy thresholding method. Finally, the resulting binary image may suffer from some misclassifications, so a postprocessing step is performed in order to extract a robust finger vein pattern.

    Language Inference Using Elman Networks with Evolutionary Training

    In this paper, a novel Elman-type recurrent neural network (RNN) is presented for the binary classification of arbitrary symbol sequences, and a novel training method, including both evolutionary and local search methods, is evaluated using sequence databases from a wide range of scientific areas. An efficient, publicly available software tool is implemented in C++, which significantly accelerates the estimation of the RNN weights (by more than 40 times) using both SIMD and multi-threading technology. The experimental results with the hybrid training method show improvements, in all databases, in the range of 2% to 25% compared with the standard genetic algorithm.
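
    For readers unfamiliar with Elman networks, the forward pass of a standard one used as a binary sequence classifier can be sketched as follows. This shows only the textbook recurrence, not the paper's novel variant or its training method; the function names and the hand-picked weights are assumptions chosen purely so that the example runs.

```python
import math

def one_hot(symbol, alphabet):
    """Encode a symbol as a one-hot list over the given alphabet."""
    return [1.0 if s == symbol else 0.0 for s in alphabet]

def elman_classify(sequence, W_in, W_rec, b_h, w_out, b_out):
    """Return P(class = 1) for a sequence of one-hot input vectors.

    The hidden state is fed back at every step:
        h_t = tanh(W_in x_t + W_rec h_{t-1} + b_h)
    and the final state is mapped to a probability by a sigmoid.
    """
    hidden = [0.0] * len(b_h)
    for x in sequence:
        hidden = [
            math.tanh(
                sum(W_in[j][k] * x[k] for k in range(len(x)))
                + sum(W_rec[j][k] * hidden[k] for k in range(len(hidden)))
                + b_h[j]
            )
            for j in range(len(b_h))
        ]
    logit = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))

# Hand-picked weights (not trained): one hidden unit over the alphabet "ab".
ALPHABET = "ab"
W_in, W_rec, b_h = [[2.0, -2.0]], [[0.5]], [0.0]
w_out, b_out = [3.0], 0.0

def predict(word):
    xs = [one_hot(c, ALPHABET) for c in word]
    return elman_classify(xs, W_in, W_rec, b_h, w_out, b_out)

print(predict("ba") > 0.5)  # → True
print(predict("ab") > 0.5)  # → False
```

    In an evolutionary training scheme, the weight lists would form the genome of each individual and the classification accuracy on labeled sequences would serve as fitness.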